Genetic algorithm implementation for effective document subject search

Authors

  • V. K. Ivanov
  • P. I. Meskin
Abstract

The quality of documentary subject search, that is, the search for documents containing specifically coordinated information on a target subject, is not always satisfactory. Despite the availability of powerful search engines for information resources on the Internet and in special databases, the process remains time-consuming and is poorly supported both by software and methodologically. This paper describes the software implementation of a genetic algorithm for identifying and selecting the most relevant results obtained during sequentially executed subject search operations. The simulated evolutionary process generates a sustainable and effective population of search queries, forms a search pattern of documents (a semantic core), creates relevant sets of the required documents, and allows automatic classification of search results. The paper discusses the features of subject search, justifies the use of a genetic algorithm, and describes the arguments of the fitness function as well as the basic steps and parameters of the algorithm. The fitness function, or quality criterion, is determined by two factors: the position of the document in the results returned by the search engine for the maximum number of different queries, and the semantic similarity of the document to the search pattern for the given subject. The software implementation is described in detail: the general object model, the user interface, the main library of the algorithm, the morphological analysis module, the text similarity analysis module, the search module, the database management module, and the metadata management module. Information on the class composition and components of each module is provided. The implementation is one of the elements of the project Intelligent Distributed Information Management System for Innovations in Science and Education, supported by the Russian Foundation of Basic Research. The algorithm plays an important role in the functioning of adaptive search engines. The developed software creates a sufficiently broad basis for further research and development.
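The abstract gives no code, so the following is only a minimal Python sketch of the general idea: a population of search queries evolves under a fitness that rewards high-ranked documents similar to a search pattern. Everything here is an assumption made for illustration and is not taken from the paper: the toy corpus, the mock `toy_search` ranking, Jaccard overlap as the similarity measure, the `1 / rank` weighting, and the selection, crossover, and mutation settings.

```python
import random

# Hypothetical toy corpus standing in for a real search engine index.
CORPUS = {
    "d1": "genetic algorithm for document search queries",
    "d2": "search engine ranking of documents",
    "d3": "flow shop scheduling with setup times",
    "d4": "genetic algorithm for subject search and query evolution",
}
# Sorted so that runs with the same seed are reproducible.
VOCAB = sorted({w for text in CORPUS.values() for w in text.split()})


def jaccard(a, b):
    """Set-overlap similarity; an assumed stand-in for a text similarity module."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def toy_search(query):
    """Mock search engine: rank documents by word overlap with the query."""
    scored = [(jaccard(query, text.split()), doc_id) for doc_id, text in CORPUS.items()]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [doc_id for score, doc_id in ranked if score > 0]


def fitness(query, pattern):
    """Assumed fitness: similarity to the search pattern, discounted by rank."""
    total = 0.0
    for rank, doc_id in enumerate(toy_search(query), start=1):
        total += jaccard(CORPUS[doc_id].split(), pattern) / rank
    return total


def crossover(a, b, rng):
    """One-point crossover of two equal-length queries."""
    cut = rng.randint(1, min(len(a), len(b)) - 1)
    return a[:cut] + b[cut:]


def mutate(query, rng, rate=0.2):
    """Replace each word with a random vocabulary word with probability `rate`."""
    return tuple(rng.choice(VOCAB) if rng.random() < rate else w for w in query)


def evolve(pattern, generations=20, size=12, query_len=3, seed=0):
    """Evolve a population of queries and return the fittest one found."""
    rng = random.Random(seed)
    population = [tuple(rng.sample(VOCAB, query_len)) for _ in range(size)]
    for _ in range(generations):
        population.sort(key=lambda q: fitness(q, pattern), reverse=True)
        parents = population[: size // 2]  # truncation selection, parents survive
        children = []
        while len(parents) + len(children) < size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=lambda q: fitness(q, pattern))


if __name__ == "__main__":
    pattern = ("genetic", "algorithm", "search", "documents")
    best = evolve(pattern)
    print(best, round(fitness(best, pattern), 3))
```

Because the fitter half of each generation survives unchanged, the best fitness in the population never decreases from one generation to the next. A real implementation would replace `toy_search` with calls to an actual search engine and `jaccard` with a proper semantic similarity measure.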

Similar resources

An Effective Hybrid Genetic Algorithm for Hybrid Flow Shops with Sequence Dependent Setup Times and Processor Blocking

Hybrid flow-shop, or flexible flow-shop, problems have remained a subject of intensive research for several years. Hybrid flow-shop problems overcome one of the limitations of the classical flow-shop model by allowing parallel processors at each stage of task processing. In many papers the assumptions are generally made that there is unlimited storage available between stages and the setup times a...

A Technique for Improving Web Mining using Enhanced Genetic Algorithm

The World Wide Web is growing at a very fast pace and makes a lot of information available to the public. Search engines use conventional methods to retrieve information on the Web; however, their results can still be refined and their accuracy is not high enough. One of the methods for web mining is evolutionary algorithms which search according to the user interests...

An Effective Genetic Algorithm for Solving the Multiple Traveling Salesman Problem

The multiple traveling salesman problem (MTSP) involves scheduling m > 1 salesmen to visit a set of n > m nodes so that each node is visited exactly once. The objective is to minimize the total distance traveled by all the salesmen. The MTSP is an example of combinatorial optimization problems, and has a multiplicity of applications, mostly in the areas of routing and scheduling. In this paper,...

A New Method for Intrusion Detection Using Genetic Algorithm and Neural Network

The article uses neural network and genetic algorithm techniques to present a model for classification on a dataset. The goal is to design a model that can act as a firewall in a network; by combining optimized algorithms, the model aims to improve reliability and accuracy and to reduce the error rate. To this end, the article uses a feedback neural network and, compared to previous methods, increases a...

Journal:
  • CoRR

Volume abs/1504.04216

Publication date 2014